Multi-Robot Inverse Reinforcement Learning Under Occlusion with State Transition Estimation
Abstract
Multi-robot inverse reinforcement learning (mIRL) is broadly useful for learning, from passive observations, the behaviors of multiple robots executing fixed trajectories and interacting with each other. In this paper, we relax a crucial assumption in IRL to make it better suited for wider robotic applications: we allow the transition functions of other robots to be stochastic and do not assume that the transition error probabilities are known to the learner. Challenged by occlusion where large portions of others’ state spaces are fully hidden, we present a new approach that maps stochastic transitions to distributions over features.
Similar resources
Toward Estimating Others' Transition Models Under Occlusion for Multi-Robot IRL
Multi-robot inverse reinforcement learning (mIRL) is broadly useful for learning, from observations, the behaviors of multiple robots executing fixed trajectories and interacting with each other. In this paper, we relax a crucial assumption in IRL to make it better suited for wider robotic applications: we allow the transition functions of other robots to be stochastic and do not assume that th...
Multi-robot inverse reinforcement learning under occlusion with interactions
We consider the problem of learning the behavior of multiple mobile robots executing fixed trajectories in a common space and possibly interacting with each other in their execution. The mobile robots are observed by a subject robot from a vantage point from which it can observe a portion of their trajectories only. This problem exhibits wide-ranging applications and the specific application we...
Utilizing Generalized Learning Automata for Finding Optimal Policies in MMDPs
Multi-agent Markov decision processes (MMDPs), the generalization of Markov decision processes to the multi-agent case, have long been used for modeling multi-agent systems and serve as a suitable framework for multi-agent reinforcement learning. In this paper, a generalized learning automata based algorithm for finding optimal policies in MMDPs is proposed. In the proposed algorithm, MMDP ...
Dynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide a mobile robot through dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
Cooperative Behavior Acquisition in Multi Mobile Robots Environment by Reinforcement Learning Based on State Vector Estimation
This paper proposes a method that acquires purposive behaviors based on the estimation of state vectors. In order to acquire cooperative behaviors in multi-robot environments, each learning robot separately estimates a local predictive model between the learner and each of the other objects. Based on the local predictive models, robots learn the desired behaviors using reinforcement learning....